Crafting a multi-task CNN for viewpoint estimation
نویسندگان
چکیده
Convolutional Neural Networks (CNNs) were recently shown to provide state-of-theart results for object category viewpoint estimation. However different ways of formulating this problem have been proposed and the competing approaches have been explored with very different design choices. This paper presents a comparison of these approaches in a unified setting as well as a detailed analysis of the key factors that impact performance. Followingly, we present a new joint training method with the detection task and demonstrate its benefit. We also highlight the superiority of classification approaches over regression approaches, quantify the benefits of deeper architectures and extended training data, and demonstrate that synthetic data is beneficial even when using ImageNet training data. By combining all these elements, we demonstrate an improvement of approximately 5% mAVP over previous state-of-the-art results on the Pascal3D+ dataset [29]. In particular for their most challenging 24 view classification task we improve the results from 31.1% to 36.1% mAVP.
منابع مشابه
HyperFace: A Deep Multi-task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition
We present an algorithm for simultaneous face detection, landmarks localization, pose estimation and gender recognition using deep convolutional neural networks (CNN). The proposed method called, Hyperface, fuses the intermediate layers of a deep CNN using a separate CNN and trains multi-task loss on the fused features. It exploits the synergy among the tasks which boosts up their individual pe...
متن کاملIs 2D Information Enough For Viewpoint Estimation?
What does this paper demonstrate. We show that a very simple 2D architecture (in the sense that it does not make any assumption or reasoning about the 3D information of the object) generally used for object classification, if properly adapted to the specific task, can provide top performance also for pose estimation. More specifically, we demonstrate how a 1-vs-all classification framework base...
متن کاملAge Estimation by Multi-scale Convolutional Network
In the last five years, biologically inspired features (BIF) always held the state-of-the-art results for human age estimation from face images. Recently, researchers mainly put their focuses on the regression step after feature extraction, such as support vector regression (SVR), partial least squares (PLS), canonical correlation analysis (CCA) and so on. In this paper, we apply convolutional ...
متن کاملEffective training of convolutional neural networks for face-based gender and age prediction
Convolutional Neural Networks (CNNs) have been proven very effective for human demographics estimation by a number of recent studies. However, the proposed solutions significantly vary in different aspects leaving many open questions on how to choose an optimal CNN architecture and which training strategy to use. In this work, we shed light on some of these questions improving the existing CNN-...
متن کاملNext-Flow: Hybrid Multi-Tasking with Next-Frame Prediction to Boost Optical-Flow Estimation in the Wild
CNN-based optical flow estimation has attracted attention recently, mainly due to its impressively high frame rates. These networks perform well on synthetic datasets, but they are still far behind the classical methods in realworld videos. This is because there is no ground truth optical flow for training these networks on real data. In this paper, we boost CNN-based optical flow estimation in...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1609.03894 شماره
صفحات -
تاریخ انتشار 2016